Functional Metagenomics of Biomes
Prediction of the metabolism of bacterial communities from a single signature gene (such as the 16S rRNA gene) is not reliable because of genomic plasticity. A metagenomic approach, which infers function from the relative abundance of genes in the sampled microorganisms, gives higher confidence. This article provides a brief summary of a recent article by Dinsdale et al. [1], which used a functional metagenomic approach to study the functional diversity of bacterial and viral communities in 9 different environments, or biomes: subterranean (i.e. mines), hypersaline ponds, marine, fresh water, coral associated, microbialites, fish, terrestrial animals and the mosquito.

__TOC__

Microbial and viral genomic diversity

Samples were selected from the sites shown in Figure 1, where each number indicates the total number of metagenomes collected at that site, adding up to 45 microbial and 42 viral samples. All samples were then sequenced by pyrosequencing on 454 Life Sciences GS20 platforms (with a practical limit of 105 bp per read), followed by a BLASTX analysis with a significance threshold of ''E'' < 0.001.

The results were as follows: a total of 14.5 million metagenomic sequences were available from all collecting sites, and after BLASTX analysis, about 1 million sequences from the microbial metagenomes and 0.5 million sequences from the viral metagenomes mapped to specific metabolic subsystems or pathways in the SEED database (theseed.org). Figure 2 shows a summary of the number of genomes found to participate in each of the general metabolic categories. Figure 3 shows a diversity metric for the metabolic processes (the Shannon index H', whose maximum value is log(C), where C = number of metabolic categories) as well as an evenness metric normalized by the number of subsystems in each sample (i.e. H'/S, where S = number of subsystems).
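To make the diversity and evenness metrics of Figure 3 concrete, the following is a minimal Python sketch, not the authors' code. It assumes that each sample is summarized as a list of sequence counts per metabolic category or subsystem. The function names and the example counts are hypothetical. Note also that the evenness here is Pielou's form, H'/ln(S), which is a common normalization and an assumption on our part; the summary above states the ratio as H'/S.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over categories with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total          # relative abundance of this category
            h -= p * math.log(p)   # natural log; other bases only rescale H'
    return h


def pielou_evenness(counts):
    """Evenness H' / ln(S), where S is the number of categories actually observed.

    Assumption: Pielou's normalization is used here, not the H'/S ratio
    quoted in the text.
    """
    s = sum(1 for c in counts if c > 0)
    if s <= 1:
        return 0.0  # undefined for a single category; 0.0 chosen by convention
    return shannon_index(counts) / math.log(s)


# Hypothetical counts of sequences assigned to four metabolic categories.
even_sample = [250, 250, 250, 250]
skewed_sample = [970, 10, 10, 10]

print(shannon_index(even_sample), pielou_evenness(even_sample))
print(shannon_index(skewed_sample), pielou_evenness(skewed_sample))
```

With equal counts, H' reaches its maximum of ln(4) for four categories and the evenness is 1. The skewed sample scores lower on both metrics, which is the pattern the article describes as low functional evenness.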
Figure 4 shows the most convincing evidence that the metabolic profile of a community is predictive of the environment in which it lives. The plots shown in this figure are canonical discriminant functions (similar to PCA, except that the groups expected in the data are defined in advance; in this case there are 9 groups, one per biome). The most important message from this analysis is that these particular CDA axes describe 79.8% and 69.9% of the combined microbiome and viral biome metabolic variances, respectively, making the metabolic profiles of both the microbial and the viral communities a strong predictor of the biological character of the environment. The authors also report cross-validation values of 66.7% and 59.9% for the ability of the microbial and viral metagenomes to correctly predict the environment (the dependent variable). That is, when each metagenome is "covered" (held out) in turn and its environment is predicted from the remaining data, the prediction is correct 66.7% and 59.9% of the time, respectively.

The last two figures show the relative proportion of DNA sequences dedicated to specific cellular processes in different environments and then, more specifically, the relative proportion of sequences similar to genes known to be involved in flagellum function, bacterial chemotaxis and gliding motility (top to bottom row of the last figure).

[Figures: table1.png, corrected_table2.png, metaGenomics_Fig1.png, metabolic_profiles.png, motility_metagenomics.png]

Ultimately, another important hypothesis supported by these data follows directly from the results of Figure 3, which indicates a low functional evenness for both microbial and viral metagenomes: the frequency of a gene encoding a particular metabolic function reflects its importance in a particular environment, ''and'' during genetic sweeps (or selective sweeps) genes with a higher frequency are favored over changing taxonomy. In other words, variation in gene content between sympatric species (i.e. species living in the same space and potentially exchanging genes) is more likely to control gene distribution within an environment than to lead to a change in taxa for either of the sympatric organisms.

References

1. Dinsdale et al. "Functional Metagenomic Profiling of Nine Biomes", Nature 452, 629-632 (2008)